LEVELS AND BOUNDARIES: ANOTHER LOOK AT INTERNET CONTENT
by Kevin Werbach 


Today's Internet is like a teenage boy’s room. It's very hard to find
anything, but you can be sure there's pornography in there somewhere.


In other words, the real problem with the Internet is not the type or volume 
of content; it’s the limitations of existing navigational and search tools. 
Grown-ups have difficulty preventing children from viewing pornography and 
other inappropriate materials for the same reason search engines often don't 
point you to the right site: They can’t locate it. One reason Yahoo! has a 
$5 billion market capitalization is that the Web is simply unusable without 
tools to find what you're looking for. 


There is a better way. The Net is fragmenting, both vertically (into multiple 
levels of meaning) and horizontally (into bounded online communities).

Content labeling is part of this process. Labels first gained prominence as 
tools to enable filtering of material deemed inappropriate for children, but 
they can serve many other functions. In this issue, we examine the emergence 
of labeling and filtering systems, and 
what they mean for the future of the

There is significant overlap between
the debates over privacy protection and 
content controls on the Internet (and 
we have previously linked them; see
Release 1.0, 12-96 and 2-97). The two 
issues pose a common problem — absence 
of sufficient self-regulation to keep 
governments at bay — and common solu- 
tions — labeling and local choice. For 
privacy, labeling means disclosure of 
privacy policies, negotiation with 
users and use of trustmarks such as the 
TRUSTe system. For content, labeling 
means rating sites so that users can 
filter out the ones they consider inap- 
propriate for themselves or for chil- 
dren. 


The good news is that concerns about 
Internet content have fostered develop- 
ment of labeling infrastructures with 
benefits in many other areas. At least 
35 companies have stepped
into the breach to offer content controls to parents, schools and companies.
Labeling can address concerns about protecting children on the one hand, and 
government-imposed censorship on the other. 


But we're not quite there yet. Despite industry efforts, legislators
continue to propose restrictions on certain types of content. Filtering
companies have rated hundreds of thousands of sites, but that is out of an
estimated 10 million separate information sources on the Web.
Users have few choices of
label sets catering to specific values or interests; the recent deal between 
Cyber Patrol and the Anti-Defamation League to create a rating service for 
hate speech is a welcome development. We're also missing truly easy-to-use 
tools for creating customized labels. Labeling and filtering won't take off 
until users can rate sites with one or two clicks (in content-creation
tools, bookmark-management applications and elsewhere).


We're optimistic these limitations can be overcome. Filtering began as a 
way for parents to keep pornography away from their kids, but significant
new demand is coming from companies and schools eager to control what goes 
over their networks. The Extensible Markup Language (XML) and related tech- 
nical specifications will make it easier to label content in multiple ways, 
and to distribute those labels. Better search and navigational tools will
require information about the information on the Web, assigned explicitly or 
generated on the fly from aggregate usage patterns. People will continue to 
filter themselves into bounded online communities that track their inter- 
ests. 


Before exploring the many roads to content labeling, we begin with the
latest developments in the policy realm. The threat of government action has
long driven technology in this area. Despite much progress, many of those 
who first put Internet content controls on the map are not satisfied with 
private solutions.


UPDATE FROM THE POLICY FRONT 
Son of CDA 


When the Communications Decency Act (CDA) was proposed in 1995, it galva- 
nized the online community both politically and technically. A federal 
court declared the CDA unconstitutional shortly after it was passed in 
February 1996, and in June 1997 the US Supreme Court affirmed that decision 
in Reno v. ACLU. Most of the Net community breathed a sigh of relief. The 
court decision did not, however, eliminate the concerns that fostered the 
legislation in the first place. Efforts to legislate controls on online 
speech have continued, both in the United States and elsewhere.


Senator Dan Coats (R-IN), one of the CDA’s sponsors, has introduced a new,
more narrowly written bill, limited to commercial Web sites, that would
prohibit them from allowing children to access material that is “harmful to
minors.” Constitutional or not, the Coats bill uses the same basic approach
as the CDA: criminalizing a category of content. Several of the
organizations that fought the CDA, such as the Center for Democracy and Technology
(CDT) and the American Civil Liberties Union (ACLU), have announced their 
opposition to the Coats bill. The full Senate could consider the bill as 
early as this month.


Several states have also adopted legislation to restrict online content.
New Mexico, for example, recently passed a law that bans the dissemination
of material deemed “harmful to minors” on the Internet. The ACLU, joined by 
the Electronic Frontier Foundation (EFF), has filed suit to declare this law 
unconstitutional. The ACLU’s Cyber-Liberties Web site lists 13 states that
have passed laws to regulate Internet speech, with legislation pending in an 
additional 10. The argument for the futility of the CDA — that one country 
could not effectively prohibit speech on an inherently global network — 
applies with additional force to individual state laws. 


No filter, no $$ 


Senator John McCain (R-AZ), Chairman of the Senate Commerce Committee, has 
introduced the “Internet School Filtering Act,” which would require schools
and libraries to install filtering software in order to receive the new
Universal Service Fund's benefits. This program allows schools and libraries
to receive discounts on telecommunications and information services,
including Internet access. Vice President Gore has endorsed an alternative to
the McCain bill that would require schools and libraries to implement
acceptable use policies, without necessarily requiring the use of filtering
software.


Filtering is particularly controversial in libraries, which straddle the 
boundary between government-operated public forums and educational environ- 
ments for children. The American Library Association has voted to oppose 
library filtering. Many individual librarians and city governments, however,
strongly support the use of filtering software, especially on computers in
children’s sections of libraries. Last month, a federal judge in Virginia
refused to dismiss a suit brought by the ACLU and People for the American Way
challenging mandatory library filtering in Loudoun County.


Let's talk about this 


The computer and communications industries have responded to these
initiatives in several ways, including supporting groups such as the EFF,
ACLU and CDT; working through the World Wide Web Consortium (W3C) to develop
labeling technologies (see PICS, page 7); labeling many popular Web sites;
and educating policy-makers.


Perhaps the most visible activities have been conferences to bring together 
industry and government representatives. The most significant was the 
Internet Online Summit, held December 1 to 3 last year in Washington, DC.
The summit featured speeches by Vice President Al Gore, Attorney General
Janet Reno and Commerce Secretary William Daley, and participation from most
of the players in the filtering and Internet content arenas. The
Organization for Economic Cooperation and Development held a similar, if
lower-level, event in Paris on March 25.


These events show that industry players are willing to address concerns
about inappropriate materials on the Internet. Several organizations
announced initiatives at the Internet Online Summit, including netparents.org
(a resource for parents to learn about content-control technologies) and the
American Library Association (which unveiled a collection of over 700
organized and annotated links to “great sites” for kids).


On June 11 to 12, the Clinton Administration and the Annenberg Center for
Communication at the University of Southern California will host a White
House Internet Summit to discuss digital content for children and teenagers.
This event is billed as the third in a series, following the December online
summit and a February conference on Internet access for rural and low-income
communities. The June conference will examine ways to make online content
for children more widely available and easier to find. (In the long run,
children’s experiences with the Internet will depend more on educational and
entertaining content than on what we don’t want them to see.)


Outside the United States


Content issues also attract government attention outside the US. What 
counts as “inappropriate” varies from country to country. In the US,
political material is the paradigm of speech that should be protected under
the First Amendment. In China, criticism of the government may land you in
jail for subversion. Some countries, such as Singapore, are in effect
constructing national intranets, using proxy servers to filter all
international communications into and out of the country.


The transnational nature of the Internet may limit the effectiveness of such 
systems, especially as the number of international connection paths increas- 
es. Still, governments that really want to limit speech can do so. The 
best response will be to convince them that such actions will hamper their 
ability to engage in electronic commerce, and that voluntary alternatives 
(such as end-user filtering software) are available. This is the approach 
Ira Magaziner and the Clinton Administration have taken with their Framework 
for Global Electronic Commerce. 


The policy debate about content in the US has generally centered on the 
role of content creators. By contrast, many European countries have empha- 
sized a different side of the equation, the Internet service providers 
(ISPs). “Codes of conduct” for ISPs have been developed in several coun- 
tries by ISP trade associations, often at government “urging.” In these 
codes, ISPs typically commit to prohibit or remove materials from their 
sites that violate specified standards of decency.


Martin Bangemann, the member of the European Commission (EC) responsible for 
telecommunications and information technology, has suggested a “global char- 
ter” for the Internet. Bangemann's idea is that the governments of the 
world would negotiate baseline rules for a variety of Internet policy ques- 
tions, including content regulation. The EC formally endorsed the idea in 
February, and the United States has expressed qualified support for such an 
approach. A more detailed proposal is expected later this year. The EC 
has also allocated 10 million ecu for the INCORE (Internet Content Rating 
for Europe) initiative to develop a European rating system. These efforts
may be valuable, but in their current forms they tend towards government- 
mandated systems rather than user choice.


NEW ADVENTURES IN FILTERING


We have previously covered many of the major players in the filtering
business and their products, including Microsystems, Inc. [1] (Cyber Patrol),
Net Nanny, Net Shepherd, PlanetWeb, SafeSurf, RSACi, Solid Oak (CYBERsitter)
and SurfWatch (see Release 1.0, 12-96). The table on page 6 compares these and
other leading products in the filtering space. 


(Note: The numbers are not entirely reliable, because companies use differ- 
ent methodologies to calculate them. Larger numbers do not necessarily mean 
that a product is more comprehensive.) 


Since we first compared filtering products 17 months ago, the user numbers 
have shown the typical Internet growth curve. The number of rated sites has 
also increased dramatically. The claims are impossible to evaluate,
especially because virtually no company (except Net Nanny) lets users see its
full list of blocked items. If, as Steve Lawrence and Lee Giles of the NEC
Research Institute recently estimated, there are 320 million documents on the
Web today, these products still have a way to go. On the other hand, the
primary goal of most filtering products is to find sites that are
inappropriate for children, a much smaller number.


How widely are these products actually used? A study late last year spon- 
sored by FamilyPC magazine found that only 26 percent of parents surveyed 
used some form of parental control software (including the built-in features 
available through Microsoft Internet Explorer, AOL and elsewhere), and only 
4 percent employed a standalone filtering product. These numbers don’t nec- 
essarily prove that labeling isn’t working. Parents may make conscious 
choices not to filter what their children view; the same survey found that 
78 percent of parents monitor their children as they surf. AOL reports that 
45 percent of families with children enable some form of its parental con- 
trols (see page 13), which suggests that many parents will take advantage of 
tools that are simple and readily available. FamilyPC editor-in-chief Robin 
Raskin believes one of the biggest hurdles for filtering software is that 
parents feel too much work is required to install and configure the stand- 
alone products. 


The next application of content labeling will be integration of filtering 
with search engines. Net Shepherd began offering a filtered version of 
Digital’s AltaVista search engine last fall, and in February N2H2 announced
an agreement to provide a filtered version of the Inktomi search service
(see page 11). These services prevent users not only from visiting sites on 
the blocking list, but also from seeing information about them in search 
results. With search engines becoming increasingly central “portals” to the 
Internet, the option of a filtered search (so long as it is only an option) 
is highly valuable. Using these tools, children engaging in research for 
school will be less likely to stumble across the wrong kind of information. 
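
The idea of a filtered search can be sketched in a few lines of Python. This
is only an illustration, not how Net Shepherd or N2H2 build their services:
the blocked-host list, the (title, URL) result format and the function name
filter_results are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical blocking list: host names a rating service has marked as blocked.
BLOCKED_HOSTS = {"adult.example.com", "casino.example.net"}


def filter_results(results, blocked_hosts=BLOCKED_HOSTS):
    """Drop every search result whose host is on the blocking list.

    `results` is a list of (title, url) pairs, an assumed format.
    The user never sees a title or summary for a blocked site.
    """
    kept = []
    for title, url in results:
        host = (urlparse(url).hostname or "").lower()
        if host not in blocked_hosts:
            kept.append((title, url))
    return kept


raw_results = [
    ("Volcano facts for students", "http://science.example.org/volcanoes"),
    ("Adults only", "http://adult.example.com/index.html"),
]
visible = filter_results(raw_results)
```

Because the blocked entry is removed before the results are displayed, the
filtering stays optional: an unfiltered search simply skips this step.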


[1] Now a subsidiary of The Learning Company.


MAJOR FILTERING PRODUCTS

Company (product)                 Number of users             Sites rated            Source of ratings            Type
--------------------------------  --------------------------  ---------------------  ---------------------------  ------------------------
Content Advisor                   Launching 6/98              1,500,000 [2]          employees                    server
Kansmen (Little Brother)          500,000 desktops            300,000                employees                    server
Learning Company* (Cyber Patrol)  2,000,000+                  60,000 NO; 50,000 YES  employees, teachers/parents  client, server
N2H2 (Bess)                       4,000 schools, 120 ISPs     400,000                employees                    server
Net Nanny                         1,000,000 copies installed  100,000 [3]            employees                    client, server
Net Partners (WebSENSE)           2,000,000                   200,000                employees                    server
Net Shepherd*                     NA                          500,000+ [4]           volunteers, bureaus          client, label bureau [5]
PlanetWeb*                        NA                          50,000+ [6]            employees, students          server
RSACi*                            NA                          70,000+                self-rating                  label bureau
SafeSurf*                         NA [7]                      100,000+               self-rating, volunteers      label bureau
Secure Computing (SmartFilter)    NA                          250,000                employees                    server
Solid Oak (CYBERsitter)           1,500,000 licensed copies   100,000 [8]            employees, self-rating       client, server
SurfWatch*                        8,000,000 copies installed  100,000 [9]            employees, contractors       client, server, bureau

* Indicates ratings that are PICS-compatible.
[2] Total database entries; sites can have multiple entries. The number of
    separate sites is much smaller, but the company will not estimate it.
[3] Includes sites on word and phrase lists; 20,000 sites labeled.
[4] 2,000,000 total URLs; over 500,000 represent the main page of a site.
[5] Distributes labels that can be used with other companies’ software.
[6] 161,400 database records covering 50,000 complete sites.
[7] Previously offered Internet Filtering [illegible].
[8] Estimated blocked sites. Works primarily through phrase filtering.
[9] Number of rated sites in database; ratings provided by others.


PICS developments


The Platform for Internet Content Selection (PICS) provides a standardized 
technical format for labeling Internet content and for distributing those 
labels (see Release 1.0, 12-96). Designed as a less-intrusive alternative 
to the CDA, PICS is a standard for labels, not for content. By making it 
easier to create new labeling systems, PICS facilitates a diversity of 
labels. Although the CDA is just a memory, work on PICS has continued under 
the auspices of the W3C.


Most filtering products on the market today can read labels in PICS format,
and Microsoft Internet Explorer versions 3.0 and up include a “content
advisor” feature that accepts PICS-compatible labels with RSACi as the
default. Netscape will support PICS and RSACi in a new version of
Communicator to be released later this year. [10] On the other hand, we are
disappointed that many filtering products, especially the server-based
services, still are not compatible with PICS. PICS reduces the cost of
creating new rating systems, giving consumers a broader range of choices. The
ability to mix and match different sets of labels will be important if
filtering is to achieve critical mass. In the long run, insistence on
proprietary approaches will reduce user acceptance and hurt the filtering
industry.


A controversial PICS development is PICSRules, a standard for filtering pro- 
files. A profile is, in essence, a further abstraction built on top of the 
PICS labeling architecture. Labels tell you about content, while profiles 
tell you what to do with content based on certain labels. Such profiles are 
important: A certain level of violence may be perfectly acceptable for a 
15-year-old but excessive for a 5-year-old. Similarly, individual parents 
will have different views about whether sites discussing homosexuality are 
appropriate for their children. (Remember the fight over broadcast TV rat- 
ings?) PICSRules allows parents and others to set their own criteria for 
ratings. 
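
The split between labels and profiles can be illustrated with a short Python
sketch. It does not use the actual PICS or PICSRules syntax; the category
names, the 0-to-4 scale and the function name allowed are assumptions chosen
for the example.

```python
# A label describes content: ratings per category on an assumed 0-4 scale.
site_label = {"violence": 2, "nudity": 0, "language": 1}

# A profile says what to do with labels: the highest rating a viewer may see.
profile_age_15 = {"violence": 3, "nudity": 1, "language": 2}
profile_age_5 = {"violence": 0, "nudity": 0, "language": 0}


def allowed(label, profile, allow_unrated=False):
    """Return True if every rating in the label is within the profile's limits.

    A site with no label is handled by the profile owner's choice
    (`allow_unrated`), since a label cannot be assumed.
    """
    if label is None:
        return allow_unrated
    for category, limit in profile.items():
        # A category missing from the label is treated as the lowest rating
        # (an assumption of this sketch).
        if label.get(category, 0) > limit:
            return False
    return True


teen_ok = allowed(site_label, profile_age_15)   # same label, two profiles
child_ok = allowed(site_label, profile_age_5)
```

The same label yields different outcomes under the two profiles, which is the
point of keeping the description of content separate from the decision about
it.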


Filtering software already allows users to set preferences for the types of 
content they wish to block. These profiles are limited, however, as are 
labeling systems not compatible with PICS. Without a public standard,
profiles cannot easily be shared across multiple applications or users.
PICSRules also makes it far easier for organizations to distribute filtering 
profiles. Groups ranging from People for the Ethical Treatment of Animals 
(PETA) to the local PTA can generate profiles and make them available to 
their members. Parents can simply load a profile developed by an organiza- 
tion they trust, without going through a time-consuming process of setting 
preferences. There is the usual tradeoff here between ease-of-use and
individual control.


But is it censorship? 


Civil liberties organizations including the ACLU and the Electronic Privacy 
Information Center have questioned the benefits of PICSRules, and PICS 
itself. EFF has expressed concerns that PICSRules makes it easier to block 
entire domains and that the accuracy of filtering tools is being oversold. 
Beyond such legitimate issues, however, PICS critics often argue that the 
technology enables censorship. Such arguments confuse private decisions to
filter or block Internet content with government mandates. There is a vast
difference between giving individuals more information about the content
they view and tools to select it, on the one hand, and governments acting
to restrict choice on the other.


[10] The German version of Netscape Navigator currently offers RSACi on a
trial basis.


A recent press release from Watchsoft nicely illustrates the confusion about 
filtering. Watchsoft’s Disk Tracy monitors Internet usage and allows
parents to review the sites their children visit to determine whether they
are accessing inappropriate material. Watchsoft claims this approach is
superior to filtering because filters are “First Amendment challenged” and
“may be unconstitutional.” 


But the First Amendment, and the very word censorship, apply only to govern- 
ment action. It may be unconstitutional to require filtering at public 
libraries, because they are government-run institutions, but that is
entirely different from a parent installing filtering software on her child's
PC.
And if we're talking about a public environment such as a library, monitor- 
ing software is hardly a superior alternative. If people object to filters 
that prevent them from accessing certain sites, they will probably feel even 
more concerned about software that allows someone to keep track of all the 
sites they visited. The concern about both filtering and monitoring
technology is fundamentally about who uses it, and for whom.


To be sure, there are more nuanced critiques of filtering (see page 23). 
We're not trying here to resolve all the policy issues or to belittle First 
Amendment concerns. Our point is that the genie isn’t going back in the 
bottle; let’s try to understand the tradeoffs inherent in choices we make.


A truism in First Amendment law is that the answer to objectionable speech
is more speech. Well, we think the answer to concerns about labeling is
more labeling. The more different labeling systems that exist, the more
physical governments will become just one more source of ratings among many
on the Internet.


Sign of the times 


There are important links between PICS and identity management (see 
Release 1.0, 2-98). If you subscribe to a labeling service, how do 
you know you're really getting what you think? A functioning label- 
ing infrastructure requires trust in the identity of those who create 
and distribute labels. W3C has formed a Digital Signature Working 
Group to develop a standard, DSig, for digitally signing PICS labels.
DSig will allow label distributors to confirm the validity of labels
they distribute, thus enhancing the integrity of the process.


E-mail filtering


Content controls are about more than pornography. For example, cor- 
porations generally prefer that their employees not spend their days 
ordering bestsellers from Amazon.com, even though there is nothing 
inherently “undesirable” about the content of that site.


In other situations, the problem is not what is available, but how it 
is distributed. Take, for example, spam (unsolicited commercial
e-mail). Some spam involves adult content or invitations to illegal
pyramid schemes. Much of it, however, would be innocuous if not sent
in huge quantities to people who did not request it. The problem is 
the conflict between producers and consumers of information. The 
content originator derives some benefit from sending out millions of 
unsolicited advertisements. The marginal cost is effectively zero. 
Recipients, however, bear a cost they don’t control.


Ultimately, there are only two technological solutions: Change the 
economics, or change the architecture of the network. Both are being 
explored. Services that reward people for receiving commercial e-mail
or viewing advertisements (such as Juno) seek to shift the economic
incentives for both senders and receivers. Some of the leading
spammers have created a “spam-friendly” Internet backbone under the name
Global Technology Marketing, Inc., based on the idea of paying ISPs
for accepting unsolicited commercial e-mail.


We expect more experiments along these lines, until viable business 
models emerge (see Esther Dyson's book Release 2.0, pages 116-121).
The trouble is that it takes only a few rogue spammers convinced they
can make money to generate a huge volume of traffic. For this reason,
economic incentives must be combined with structural changes.


E-mail client and server software are being upgraded with new anti-spam
features. At the same time, e-mail is evolving from an ISP-provided
service to one that is increasingly available on a standalone
basis via the Web (through services such as Hotmail, Rocketmail, and
Yahoo Mail). This trend will initially make it easier for spammers
to obtain “disposable” e-mail accounts from which to launch their
messages. In the long run, though, the evolution of e-mail architecture
may be the most effective way to prevent spam.


A company called Critical Path shows one way. Critical Path is
building a business around e-mail outsourcing, offering a customizable
Web-based front end combined with high-performance back-end
servers. It promises scalable performance and value-added functionality
to ISPs, Web hosting providers and companies. As part of its
service, Critical Path offers sophisticated spam blocking. Critical
Path uses both a list of spammer addresses and algorithms that identify
large volumes of identical e-mail not associated with legitimate
mailing lists. The beauty of this approach is that it enjoys
increasing returns: the more customers Critical Path signs up, the
larger and more diverse a sample it has from which to identify spam.
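
A rough Python sketch of the two techniques described here, a sender list
plus detection of identical messages sent in bulk, might look like the
following. It is not Critical Path's algorithm; the threshold, the addresses
and the names SpamDetector and is_spam are hypothetical.

```python
import hashlib
from collections import defaultdict


class SpamDetector:
    """Flag mail from listed spammers or identical bodies sent in bulk."""

    def __init__(self, spammer_addresses, mailing_lists, bulk_threshold=3):
        self.spammer_addresses = set(spammer_addresses)
        self.mailing_lists = set(mailing_lists)      # legitimate bulk senders
        self.bulk_threshold = bulk_threshold         # assumed cutoff
        self.recipients_by_body = defaultdict(set)   # body hash -> recipients

    def is_spam(self, sender, recipient, body):
        if sender in self.spammer_addresses:
            return True
        if sender in self.mailing_lists:
            return False
        # Identical text yields the same digest, however many copies arrive.
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        seen = self.recipients_by_body[digest]
        seen.add(recipient)
        # More customers means more recipients observed per message,
        # so bulk mailings cross the threshold sooner.
        return len(seen) >= self.bulk_threshold


detector = SpamDetector(
    spammer_addresses={"bulk@spam.example"},
    mailing_lists={"news@list.example"},
)
verdicts = [
    detector.is_spam("x@unknown.example", f"user{i}@isp.example", "Make money fast")
    for i in range(4)
]
```

In this sketch the first copies of a bulk message slip through and later
copies are flagged once enough distinct recipients have been seen, which is
why a larger customer base makes detection faster.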


FILTERING ON THE SERVER SIDE


Content can be filtered either at an individual PC or at the server level,
using either self-labeling or third-party services. Most of the original
filtering companies, such as SurfWatch, Net Nanny and Microsystems, began by
distributing software that users could install on their PCs. There is, how- 
ever, a significant market for server-based filtering. Schools that have 
deployed local area networks are obvious customers. Companies also employ 
content controls to prevent their employees from downloading inappropriate 
materials or conducting personal activities on company time. Finally, dial- 
up ISPs can integrate filtering products into their networks (for example, 
by reselling the Bess service described below), and offer them as value- 
added services to their clients. 


As a technical matter, server-based filtering is particularly appropriate 
for companies or schools that connect to the Internet through a firewall or 
proxy server. These organizations already route all of their incoming 
Internet traffic through a gateway, so filtering software deployed in one 
place can control access for all users on the network. Like any server- 
based application, these tools can be managed centrally without the need to 
administer individual users’ computers (although they do leave less room for 
individual variation). The server is always connected to the Internet, so 
the filtering software provider can update the ratings database automatical- 
ly. Filtering can also be combined with other functions, such as network 
security and caching.


Most of the client-side filtering software providers, including The Learning 
Company, Net Nanny, Solid Oak and SurfWatch, now offer server-based versions
of their products. We describe below three very different companies that
concentrate on server-side filtering. We then look at the parental controls
that AOL, the largest ISP, makes available to its users.


Bess 

N2H2, Inc., is a private Seattle-based Internet filtering and caching
company founded in 1995 by the husband-and-wife team of Peter Nickerson and
Holly Hill. The couple named their server-side filtering service Bess after
their pet dog, a retriever, and the friendly pooch is depicted prominently on
the Bess home page.


Nickerson is a former economics professor who wanted to keep his kids away 
from adult sites on the Internet. He found that children could easily 
evade client-based filtering products, so he began the company to develop a 
server-based filtering service. Bess uses a proxy server to block sites 
based on a database of hundreds of thousands of URLs. N2H2's employees 
review every site considered for blocking; less than half of the sites they 
examine wind up being blocked. In addition to filtering, Bess provides a 
colorful Web site with links to several dozen kid-friendly sites and
Web-based e-mail.


The company initially focused on the consumer market, but today the bulk of 
N2H2's business is turnkey filtering services for over 4,000 schools. N2H2
installs a proxy server in the school network and then sends updates to the
database of blocked sites to each proxy server on a daily basis. N2H2
also provides technical support for the equipment. Pricing varies based on
the size of the customer, from $60 per month for a small school to several
thousand dollars a month for larger institutions. A similar service is
available to businesses and libraries. 


N2H2 also sells to approximately 120 ISPs, allowing them to offer filtered 
access to their customers for an additional monthly fee. The ISP
partnership program includes the proxy server as well as sales and marketing
materials; once Bess is installed at the ISP, the ISP acts as a reseller and
can provide the service to individual customers for an additional fee.
Finally, users in Western Washington State can get filtered dial-up Internet
access directly from N2H2 for $24.95 per month. (N2H2 also operates an
unfiltered ISP, Rainier.net.)


The N2H2 proxy server also offers caching to improve network performance 
(N2H2 claims 30 to 50 percent of files can be retrieved locally from its
caches). Caching and filtering are usually thought of as separate product 
categories, but as Bess demonstrates they can easily be combined. Nickerson 
argues that caching is especially valuable in educational settings, because 
schools tend to have limited bandwidth and teachers can pre-load sites cov- 
ered in their lesson plans. One could just as easily imagine a company 
(such as Mirror Image, Inktomi or Skycache) using filtering as a throw-in to 
enhance its caching services.


Integration with caching products may also address the potential performance 
degradation when all queries must go through a filtering proxy server.
Caching and related technologies are designed to improve network perform- 
ance, so filtering built on top of caching servers should prove to be more 
scalable than standalone alternatives.
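
The combination of filtering and caching in one proxy can be sketched in
Python as follows. This is a simplified illustration, not N2H2's design: the
class FilteringCachingProxy, the in-memory dictionary cache and the fetch
callback standing in for a real network request are all assumptions.

```python
class FilteringCachingProxy:
    """Check the block list first, then serve from cache, then fetch."""

    def __init__(self, blocked_urls, fetch):
        self.blocked_urls = set(blocked_urls)  # updated by the provider
        self.fetch = fetch                     # stand-in for a network request
        self.cache = {}
        self.cache_hits = 0

    def update_blocked(self, new_blocked_urls):
        """Replace the block list, as a daily update might."""
        self.blocked_urls = set(new_blocked_urls)

    def get(self, url):
        if url in self.blocked_urls:
            return None                        # blocked: nothing is returned
        if url in self.cache:
            self.cache_hits += 1               # retrieved locally
            return self.cache[url]
        page = self.fetch(url)
        self.cache[url] = page
        return page


fetched = []


def fake_fetch(url):
    """Hypothetical fetcher that records which URLs reached the network."""
    fetched.append(url)
    return f"<html>contents of {url}</html>"


proxy = FilteringCachingProxy({"http://adult.example.com/"}, fake_fetch)
first = proxy.get("http://school.example.org/lesson1")
second = proxy.get("http://school.example.org/lesson1")   # served from cache
blocked = proxy.get("http://adult.example.com/")
```

Checking the block list before the cache means a page that is blocked after
it was cached is still refused, and a blocked request never costs any
bandwidth.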


At the National Educational Computing Conference for K-12 and university 
educators in late June, N2H2 will unveil an advertising-supported filtered 
search service built on the Inktomi search engine. Nickerson says that
schools are increasingly taking for granted the need for filtering mecha- 
nisms as they recognize the type of material available online. Eventually, 
however, he believes the market for filtering in corporate settings may 
exceed the educational opportunities. 


SmartFilter 


Secure Computing is a 350-employee Honeywell spinoff that went public in 
November 1995 and generated $48 million in revenue last year. The company 
began as a developer of security systems for government agencies such as the 
National Security Agency, but more recently has moved into the enterprise 
security space. As part of that transition, Secure Computing acquired
Webster Network Technologies, one of the first server-based filtering compa- 
nies, in May 1996. Secure Computing offers a content-filtering proxy server 
called SmartFilter based on the technology acquired with Webster.


SmartFilter integrates with numerous different products, including Secure 
Computing’s own products and the Microsoft and Netscape proxy servers. The 
software is also available as a standalone proxy server for Windows NT and 
all the major flavors of UNIX. All of these implementations use 
SmartFilter’s control list of some 250,000 entries, which include both indi- 
vidual pages and complete file directories. Testing against unfiltered 
proxy logs from potential customers, Secure Computing typically finds that 
40 percent of the addresses users request are on its control list. 
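

A control list that mixes individual pages with complete directories can be pictured as two lookup tables: one for exact matches and one for path prefixes. The sketch below is only an illustration of that idea. The entries, the category names attached to them and the function names are hypothetical; nothing here reflects SmartFilter's actual data format.

```python
from urllib.parse import urlsplit

# Hypothetical control list (all entries invented for illustration):
# exact pages and whole directories, each mapped to a category.
BLOCKED_PAGES = {
    "example.com/jokes/today.html": "humor",
}
BLOCKED_DIRECTORIES = {
    "example.org/sports/": "sports",
}


def normalize(url):
    """Reduce a URL to host plus path, lowercasing the host."""
    parts = urlsplit(url)
    return parts.netloc.lower() + (parts.path or "/")


def lookup(url):
    """Return the category for a URL, or None if it is not on the list."""
    key = normalize(url)
    if key in BLOCKED_PAGES:
        return BLOCKED_PAGES[key]
    # A directory entry covers every page whose path starts with it.
    for prefix, category in BLOCKED_DIRECTORIES.items():
        if key.startswith(prefix):
            return category
    return None
```

A real list of 250,000 entries would want a faster prefix structure than a linear scan, but the matching rule would be the same.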


To build its list, Secure Computing first uses automated systems to scan
documents and assign recommended categories. Employees then review and con- 
firm the classification of every site before updating the control list. The 
company is funding academic research into more sophisticated pattern recog- 
nition algorithms, using neural nets to learn from previously categorized 
sites. 


Secure Computing focuses primarily on the corporate market; schools repre-
sent one third of customers but a smaller share of revenues. Consequently,
the control list is broad, allowing administrators to select from 27 cate-
gories ranging from the obvious (“sex,” “drugs,” “extreme or obscene,”
“criminal skills”) to the seemingly mundane (“investing,” “humor,”
“sports”) to the abstract (“worthless,” “non-essential”). Through partner-
ships with Siemens and ViRT, the company offers localized versions of
SmartFilter in German and Japanese.


The latest version of SmartFilter for Secure Computing’s Firewall for NT 3.0 
can also use the control list to manage bandwidth rather than block sites. 
Network administrators can assign lower priority to sites falling into “non- 
business” categories, so that they take longer to download. Secure 
Computing’s Richard Viets says that early adopters of filtering technology 
generally wanted to block inappropriate sites, but many new customers find 
“kinder, gentler” approaches more consistent with their corporate culture.


Content Advisor 


Content Advisor branched off from Coffeehaus Networks, a Somerville, MA 
developer of database and other technology for Internet service providers.
The company developed the underlying software technology to implement the 
RSACi labels available through the Content Advisor feature in Microsoft 
Internet Explorer (hence the name). Founder Steve Shannon concluded that 
the RSACi system required too many arbitrary decisions about the level of 
sex or violence on a site, and relied too much on site creators to self- 
label. He decided to build a third-party ratings database as an alterna- 
tive. Content Advisor’s first end-user product will launch in June. 


Shannon sees technology as Content Advisor's secret weapon. Content Advisor 
uses a patent-pending distributed Web spider technology to scour the Web for 
new sites to rate. Its database currently holds about 4 to 7 million URLs,
of which 1.5 million have been rated so far. Content Advisor's staff of 6
to 7 editors are able to classify 250,000 URLs per month into 30 different
subject categories. The goal is eventually to categorize everything on the
Web. The company will accept “reasonable requests” from Webmasters to find 
out how a site is classified, but does not provide direct access to its 
database. 


Content Advisor has nonexclusive partnerships with proxy server and firewall
companies such as Checkpoint to incorporate filtering as a plug-in feature.
The company is focused entirely on the corporate market; users can block any
subject area they consider unnecessary or inappropriate for normal business
uses.


The software is designed to harmonize with the way companies manage their
existing firewalls. Administrators can set different blocking preferences
for different parts of a company or time of day (so employees can visit
sports or travel sites on their lunch hour), and can specify sites to al-
ways or never block. The product will be priced at $1,800 per year for 250
users to start, and pro-rated above that.
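

To make the combination of per-department preferences, time-of-day exceptions and always/never lists concrete, here is a minimal sketch. Every department, host name and rule in it is an assumption made up for the example; it does not describe Content Advisor's actual configuration model.

```python
from datetime import time

# Hypothetical policy tables (all names and values are assumptions).
LUNCH_START, LUNCH_END = time(12, 0), time(13, 0)
BLOCKED_BY_DEPARTMENT = {
    "engineering": {"sports", "travel"},
    "sales": {"sports"},
}
LUNCH_EXCEPTIONS = {"sports", "travel"}   # categories opened up at lunch
ALWAYS_ALLOW = {"intranet.example.com"}
ALWAYS_BLOCK = {"games.example.com"}


def is_allowed(department, host, category, now):
    """Decide whether a request passes, checking explicit site lists first."""
    if host in ALWAYS_ALLOW:
        return True
    if host in ALWAYS_BLOCK:
        return False
    if category not in BLOCKED_BY_DEPARTMENT.get(department, set()):
        return True
    # The category is blocked for this department except during lunch.
    return category in LUNCH_EXCEPTIONS and LUNCH_START <= now < LUNCH_END
```

The order of the checks is the design choice worth noticing: explicit site lists override category rules, and the time-of-day exception is consulted last.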


In the future, Content Advisor plans to develop subject-specific search 
engines based on its massive database. For example, someone looking for 
sports-related information would be able to search a database limited to 
sites classified in that category. This will significantly reduce the num-
ber of irrelevant hits generated when searching through the entire Web.


AOL Parental Controls 


As the largest online service and the one with the highest percentage of new 
users, America Online (AOL) has a big role (wanted or not) in protecting 
children from inappropriate material. AOL is a well-known bounded community 
(see page 22). That is, users can reach public Internet sites through AOL, 
but AOL offers a significant amount of proprietary content even if access to 
the public Internet is turned off. Equally important, AOL has contractual
relationships with all its users and content providers. These contracts 
define what is “appropriate” for the AOL service, and give AOL the authority 
to remove material that violates its contract terms, concerning propriety 
among other things. (Although, as the Matt Drudge experience demonstrates,
this process doesn’t always work smoothly.)


AOL offers proprietary parental controls for its own content, and partners 
with The Learning Company (makers of Cyber Patrol) to offer filtered access 
to outside Websites. The simplest controls are algorithms and live forum 
monitors to prevent restricted words in areas such as chat rooms. More 
sophisticated tools can be configured through the parental controls area,
which in AOL version 4.0 is always available from the main screen. As
noted, nearly half of families with children use these controls. 


Parents can assign password-protected screen names to each of their chil-
dren. Four default levels are available. “18+” provides full access to AOL
features and outside Internet sites. “Mature teen” limits access to Web 
sites deemed appropriate for the 16-to-17 age group, restricts access to 
newsgroups that allow file attachments and does not allow access to premium 
services (which involve additional fees). “Young teen” restricts newsgroup 
and premium service access, limits Web sites to those deemed appropriate for 
the 13-to-15 age group and also prohibits access to member-created or pri- 
vate chat rooms. Finally, “Kids Only,” as the name implies, restricts users 
to AOL's Kids Only channel, and also disables instant messages, chat rooms 
and file attachments to e-mail. All of these settings can be customized.
Many parents, however, prefer the simplicity of the default settings.


AOL enhances its parental controls as it adds new functionality. For exam-
ple, parents can prevent their children from receiving instant messages,
except from an approved list of users.


Filtering and consumer access devices


Internet access is increasingly available through devices other than
personal computers. These include dedicated Internet clients such as
WebTV, cable set-top boxes, Sega Saturn video game machines, DVD
players and screen phones (not to mention the Pentium II-enabled
refrigerator announced last month by V Sync Technology of Japan).
With PCs in only 45 percent of US homes (and even lower penetration
rates elsewhere), there is a huge market for such low-cost Internet
appliances.


Families with children are a target market, so inappropriate content
can be a particular concern. Server-based filtering is usually the
best approach here, due to the limited processing power and storage
in the appliances themselves.


Planet Web, which makes Internet software for a variety of embedded
processor devices, offers an integrated parental control service that
limits access using a database of 50,000 sites considered acceptable
for children (see Release 1.0, 12-96).


Planet Web's new CEO, Jan Gullett, came from Broderbund Software, so
he is familiar with the ratings process used in the computer gaming
world. According to Gullett, there are significant economic
challenges to operating an effective parental control system. The
only way to ensure that inappropriate sites are blocked is to block
unrated sites, but reviewing sites for inclusion in the “white list”
of acceptable sites is a time-intensive and costly process.


Gullett also sees the absence of an industry-backed third-party
labeling bureau as a significant limitation. The most prevalent
independent labeling standard, RSACi, relies on self-labeling, which
Gullett believes does not provide sufficient assurance of the
accuracy and consistency of labels.


From our perspective, the more label bureaus the merrier. RSACi
fills a niche and so would a similar organization based on third-
party labels. On the other hand, de facto “official” ratings would
come close to a government-mandated system. Gullett’s idea only
makes sense if users have options and understand what those options
mean.


METADATA: THE INFRASTRUCTURE FOR LABELING


What does the future hold for Internet content controls? The next step 
beyond PICS and the current crop of filtering tools is a generalized metada- 
ta architecture. Metadata (information about information) is important, and 
not only because it facilitates effective parental controls. 


The Internet is a network of machines, but Internet content is generally not 
in a format that machines can understand. Confronted with an electronic 
copy of the Bible, a computer may be able to tell you the title, the number 
of words and, using algorithms, something about the content. But the com- 
puter probably couldn't tell you that much of the world considers it a 
sacred text, and it might have a hard time deciding whether it was appropri-
ate for children (there's an awful lot of sex and violence in there).


The solution is to provide more information about content in the form of 
labels, which describe human-readable content in machine-readable form. A
computer doesn’t need to read the Bible to decide whether to block it; the 
computer need only read the associated label and apply a pre-defined set of 
rules. Moreover, once a labeling infrastructure is in place, it can be used 
for many other functions. Bar codes increase the speed and functionality of 
everything from package shipping to grocery shopping, because they put data 
about objects into machine-readable form. Done right, labeling can bring 
the same benefits to Internet content.
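

The "read the label, apply the rules" step is small enough to sketch. The example below assumes a label is a set of numeric scores per dimension, in the spirit of PICS, and that the rules are simply a maximum allowed score for each dimension. The dimension letters, the limits and the treatment of missing dimensions are all assumptions chosen for illustration, not part of any standard.

```python
def is_permitted(label, limits, block_unlabeled=True):
    """Apply pre-defined limits to a content label.

    label  -- dict of dimension -> numeric score, or None if unlabeled
    limits -- dict of dimension -> highest score the user will accept
    """
    if label is None:
        # Unlabeled content: the user (or parent) picks the default.
        return not block_unlabeled
    # Assumption: a dimension missing from the label counts as 0.
    return all(label.get(dim, 0) <= cap for dim, cap in limits.items())


# Hypothetical limits a parent might set: s = sex, v = violence.
family_limits = {"s": 0, "v": 1}
```

Note that the filter never looks at the content itself; given the same label and the same limits, it always reaches the same answer.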


Labels are a form of metadata. Movie reviews are metadata, as are the 
ToolTips that appear whenever you hold your mouse above a toolbar icon in 
most Microsoft applications. Metadata is most useful when there are well- 
defined standards for its expression, association with underlying data, dis- 
tribution and interpretation. 


The World Wide Web Consortium (W3C), recognizing the recurrence of metadata 
across several of its activities, began a coordinated metadata project in 
June 1997.


PICS is a metadata framework optimized for a specific domain — rating 
Internet content to allow filtering of inappropriate material. The same 
kind of architecture, however, could make content easier to search, or could 
provide ownership information or license terms for intellectual property 
purposes. Communities of interest have done significant work to build 
domain-specific architectures, but a standard framework for building metada-
ta would be much more powerful.


As W3C Director Tim Berners-Lee puts it, “metadata is data.” In other 
words, we can have metadata about metadata, and on and on forever. Many 
applications will require only one descriptive layer on top of the content 
itself, but in other cases it will be beneficial to build more complex
structures. With metadata, the Net is no longer flat, but becomes a flexi- 
ble, multi-dimensional hierarchy of abstractions built upon abstractions 
linked together in an infinite number of ways. Oh, and it gets easier to 
label Web pages.


XML 


Sophisticated metadata requires a metalanguage. Different user communities 
will want different information about documents, but the means of encoding 
these linkages must be standardized. The Extensible Markup Language (XML)
serves this function. XML, which became a W3C recommendation in February,
is a framework for creating specialized markup languages for pages and
other objects on the Web. Stated more concretely, it allows content cre-
ators to employ an unlimited number of tags, rather than being limited to
those in the Hypertext Markup Language (HTML) specification. Among other
things, these tags can provide information about content (aka metadata) in a
standard way.


Routers and labeling


Just as Web content is more useful when surrounded by metadata,
Internet packets can be more efficiently routed when not simply an
undifferentiated mass. Cisco's “tag-switching” technology identifies
and tags “flows” of associated traffic, such as a streaming video
clip. Instead of examining each packet from scratch, routers need
only read the tag, enabling faster routing. Tag switching will also
help address the great looming challenge of Internet architecture:
differentiated quality of service. As with Internet content, label-
ing adds levels of meaning, making possible many different features.


Other router vendors have developed similar technologies, and the
Internet Engineering Task Force is currently developing an industry
standard, called Multiprotocol Label Switching.


Most of the advantages and disadvantages of the Web as a content distribu-
tion medium stem from HTML. Tim Berners-Lee developed HTML based on the
International Standards Organization's Standard Generalized Markup Language 
(SGML). We have discussed SGML, and its relationship to HTML, at some 
length in the past (see Release 1.0, 6-91, 9-94). One of the significant 
benefits of SGML, and to a lesser extent HTML, is that it allows metadata 
to be associated with documents. The following line in a Web page 


<title>Lakers Win 1998 NBA Championship</title> 


tells any browser that the information between the two bracketed tags is the 
title of the page. 
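

As a rough illustration of how software can pull that one piece of metadata out of a page, the sketch below uses the HTML parser in Python's standard library. The class and function names are ours, chosen for the example.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    """Return the page title, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()
```

The same pattern works for any tag the parser recognizes, which is exactly the limitation: HTML offers only a handful of such descriptive tags.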


The problem is that different applications call for different forms of meta-
data. A scientist coming upon a scholarly paper might want to know the
author, the title and the section headings. Such labels, however, don’t
register violence or porn. SGML allows many different types of markup,
because every document must be associated with a document type definition
(DTD). A DTD for Department of Defense procurement documents looks very
different from a DTD for molecular biology research. The downside is that
creating DTDs can be cumbersome, especially for non-technical applications.


HTML does not require separate DTDs, and sharply limits the available markup 
elements. This simplicity has facilitated the explosive growth of the Web.
On the other hand, the small number of elements constrains the possible 
applications. Metadata systems based on HTML, such as PICS, cannot adapt to 
new requirements without revisions to the standard, and to client software 
based on the standard. Thus, for example, PICS expresses labels numerically 
to facilitate internationalization, and as a result a PICS label cannot eas-
ily contain the title of a book.


Effective Web metadata requires an alternative to HTML that provides the 
flexibility of SGML without the complexity. XML is designed to do just 
that. XML has already become something of this year’s Java, a poorly-under- 
stood technology over-hyped as the Next Great Thing. As with Java, the 
reality is more complicated.


XML frees Web developers from using the limited set of tags available from 
HTML (or proprietary “extensions” to HTML that Netscape and Microsoft seem 
to introduce with each new browser version). Every XML document can have 
its own DTD, which defines the available set of tags. Because XML is a 
subset of SGML, existing SGML DTDs will work under XML.


An XML document looks very much like an HTML document; the primary differ- 
ence is the broader range of tags. XML is not fully backward compatible 
with HTML, because HTML allows a certain degree of sloppiness in coding that 
can choke XML parsers. However, it’s relatively easy to turn HTML documents 
into well-formed XML documents, and browsers can easily be designed to read 
both formats. 
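

The difference in strictness is easy to demonstrate with the XML parser in Python's standard library. The two snippets below are invented for the example: the first is the kind of loose markup browsers tolerate in HTML, and the second is the same content tidied into well-formed XML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Typical HTML sloppiness: unclosed <p> and <br> tags, no single root.
sloppy = "<p>First paragraph<br><p>Second paragraph"

# The same content with every tag closed and one enclosing element.
tidy = "<div><p>First paragraph<br/></p><p>Second paragraph</p></div>"
```

Here `is_well_formed(sloppy)` is false and `is_well_formed(tidy)` is true; the cleanup amounts to closing every tag and wrapping the content in a single root element.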


Resource Description Framework


The Resource Description Framework (RDF) is another component of the 
W3C metadata activity. RDF is a framework, based on XML, designed 
specifically for metadata. RDF was heavily influenced by PICS, but 
it can be used for any type of metadata. The RDF specifications are 
currently in working draft form (under the direction of W3C’s Ralph
Swick). Some details may change, but overall the standard works as 
follows. 


Conceptually, RDF defines descriptions as triplets. For example, we 
would indicate that Bill Gates is the CEO of Microsoft as follows:


[Microsoft] --CEO--> “Bill Gates”


The first item describes the specific resource to which the metadata 
applies. The second item is a property of that resource, and the
final item is the value of that property. Properties are defined by
schemas, which adapt the general structure of RDF to specific subject 
areas. 


If we want to make it easier to search the Web, we might use a 
schema such as Dublin Core, which the Online Computer Library Center 
and others developed for that very purpose. Or we could use a 
schema for labeling inappropriate content for children, otherwise 
known as PICS. 


[http://www.genericpornosite.com] --PICSLabel--> “RSACi s3”


In other words, the site has a PICS label of s3 (“frontal nudity”)
under the RSACi standard. RSACi is, in effect, a subset of PICS,
but PICS can be used independently of RSACi. The RDF schema specifi-
cation will make it possible to define hierarchies of schemas, and
constraints on the values possible under any of those schemas.


A powerful feature of RDF is that it is recursive (as noted, metadata 
is data). A description can also be a resource subject to asser- 
tions: 


[PICSLabel] --StatementBy--> “Kevin Werbach”


In other words, Kevin Werbach stated that a certain PICS label should 
be associated with a page.
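

The triplet model, including its recursive twist, can be sketched as a small data structure. This is our own illustration rather than anything defined in the RDF drafts: the class name, field names and printing format are assumptions, and the property names simply echo the examples in this sidebar.

```python
from typing import NamedTuple, Union


class Statement(NamedTuple):
    """One description: a resource, a property of it, and a value."""
    resource: Union[str, "Statement"]   # a statement can itself be described
    prop: str
    value: str


ceo = Statement("Microsoft", "CEO", "Bill Gates")
label = Statement("http://www.genericpornosite.com", "PICSLabel", "RSACi s3")

# Recursion: the resource of this statement is another statement.
attribution = Statement(label, "StatementBy", "Kevin Werbach")


def describe(statement):
    """Render a statement as text, nesting any statement used as a resource."""
    if isinstance(statement.resource, Statement):
        subject = "(" + describe(statement.resource) + ")"
    else:
        subject = "[" + statement.resource + "]"
    return f'{subject} --{statement.prop}--> "{statement.value}"'
```

Because a statement can stand in the resource position, nothing new is needed to say who made a claim, when, or with what authority.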


The current version 1.1 implementation of PICS doesn't use RDF, 
because it was developed before RDF and XML were created. However, 
W3C plans to migrate PICS to RDF in version 2.0. 


In practice, RDF metadata will be encoded in XML. XML documents will
reference schemas using the XML namespace facility, which is still
under development. Such documents will begin with pointers to the
RDF specification and to whatever schemas the document employs. A
schema can be written as a DTD, but an XML document can also refer-
ence multiple schemas and combine elements from all of them.
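

To give a feel for what such a document might look like, the sketch below parses a small RDF description that points to two vocabularies through namespace declarations. One loud caveat: the working drafts are still changing, so the syntax and the two namespace addresses used here are assumptions, taken from the forms in which the RDF and Dublin Core vocabularies were later published. The page address and the creator value are invented.

```python
import xml.etree.ElementTree as ET

# Assumed namespace addresses (from the later published vocabularies).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

DOCUMENT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.example.com/lakers.html">
    <dc:title>Lakers Win 1998 NBA Championship</dc:title>
    <dc:creator>Example Sports Desk</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""


def triples(xml_text):
    """Return (resource, property, value) triplets found in the document."""
    root = ET.fromstring(xml_text)
    found = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        resource = description.get(f"{{{RDF_NS}}}about")
        for child in description:
            # ElementTree spells tags as "{namespace}name".
            namespace, _, name = child.tag[1:].partition("}")
            found.append((resource, namespace + name, child.text))
    return found
```

The namespace prefix is what lets elements from several schemas share one document without colliding: each property is identified by its schema's address plus its name.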


RDF metadata can be embedded in a document (e.g., in HTML), supplied 
by the transfer mechanism (e.g., the HTTP server), or transported as 
a separate resource. The last method is particularly significant for 
content filtering tools — independent metadata “service bureaus” 
(what PICS calls label bureaus) can assign labels to documents and 
serve them to users. PICS supports third-party labeling today, but
RDF will expand the functionality of such structures. RDF-based
service bureaus can distribute not only content ratings, but also
privacy information under the P3P standard (see Release 1.0, 4-98), 
intellectual property information, keywords for search engines and 
other types of labels. 


As with any assertion about content, labels require trust on the part 
of users. RDF therefore allows labels to be signed. “Signatures” 
can be as simple as a statement that the label applies as of a cer- 
tain date. On the other end of the spectrum, RDF labels can employ 
full-blown digital signatures using public key cryptography (see page 
7 and Release 1.0, 2-98).


RDF seems likely to catch on because it is so flexible. Browser 
vendors and other companies can incorporate one standard, which can 
interoperate across different platforms and different substantive 
domains. The demand from many different communities can be combined 
to help the protocol reach critical mass. 


One way this might happen is as follows. Everyone wants better 
Internet search engines. If all major sites were labeled with the 15 
“card catalog” elements of Dublin Core, searching would be much more 
efficient. Therefore, the search engine companies, ISPs, browser 
companies and others should want content creators to label their 
sites in a standardized way. RDF could help make that possible. 


Once RDF is broadly implemented and labeling becomes a standard part 
of creating sites, it gets much easier to envision widespread label- 
ing of inappropriate content. Perhaps the leading vendors of 
Website-creation tools could make labeling a default step in creating 
a page? This would all be voluntary, of course, but any step that 
reduces the transaction costs of labeling will make a difference.


Alexa Internet


PICS, XML and RDF are the plumbing for an effective metadata infrastruc- 
ture. But we also need user interfaces. The broad consumer and corporate 
market will require horizontal, user-friendly tools, and not just for fil- 
tering software. PICS and RDF mean application developers can incorporate 
one piece of code to enable a vast range of labeling systems. 


A good example is Alexa (see Release 1.0, 12-97). Alexa Internet was 
founded in April 1996 by Brewster Kahle and Bruce Gilliat. Kahle was the 
developer of the Wide Area Information Server (WAIS), an Internet search 
technology for the Connection Machine that predated the Web. Before that 
he spent six years as the lead engineer for massively parallel computer 
vendor Thinking Machines.


Alexa grew out of a project of Kahle’s called the Internet Archive. The 
Internet Archive maintains snapshots of almost the entire Web at different 
points in time, beginning in early 1996.[11] Currently, the archive compris-
es 10 terabytes worth of data. The Internet Archive is a non-profit
organization designed to make the Web accessible for historians and 
researchers, although copyrighted material is protected. Alexa is a for- 
profit enterprise to make use of the data in the archive. 


The Alexa software places a small floating toolbar at the edge of a user's 
browser window. From the toolbar, users can get information about the 
site they are visiting, see suggested links to other sites, access pages 
from the archive that are no longer available or use reference tools pro-
vided by Encyclopaedia Britannica.[12] Alexa’s suggested-link feature tracks
the paths that its users follow through the Internet, and aggregates that
information using collaborative filtering technology. Based on this
information, combined with data mining from its huge archive, Alexa sug-
gests sites related to the current one.
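

Alexa has not published how its suggestions are computed, so the following is only a toy stand-in for the general idea of aggregating user paths: count which site users visit next after each site, then suggest the most common destinations. The function names and the sample paths are invented.

```python
from collections import Counter, defaultdict


def build_transitions(paths):
    """Count, for every site, which sites users went to next."""
    counts = defaultdict(Counter)
    for path in paths:
        for here, there in zip(path, path[1:]):
            if here != there:
                counts[here][there] += 1
    return counts


def suggest(counts, site, limit=3):
    """Return up to `limit` sites most often visited after `site`."""
    following = counts.get(site, Counter())
    return [destination for destination, _ in following.most_common(limit)]


# Anonymous browsing paths; only the aggregate counts are kept.
sample_paths = [
    ["news", "sports", "scores"],
    ["news", "sports"],
    ["news", "weather"],
    ["home", "news", "sports"],
]
```

Only the aggregate counts are stored, not who followed which path, which is consistent with the way the service is described.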

Alexa provides several forms of “explicit” metadata about sites, including 
contact information about the owner (drawn from the InterNIC registration 
information); a tally of ratings from other Alexa users; the relative 
speed of the server; the presence of a TRUSTe trustmark; and, last but not 
least, RSACi labels. Alexa doesn’t prevent users from accessing any 
sites; it simply indicates whether a site has been labeled with RSACi, and
how it has been labeled.


Alexa also renders “implicit metadata” about sites explicit, a process
Kahle describes as “global peer review.” Kahle believes that as the Web
grows such navigation aids will be essential. Explicit self-labeling,
although useful for many purposes, will never be universal. By analyzing
usage tracks, Alexa can gain information about sites without conscious
labeling. As with any collaborative filtering application (see Release
1.0, 11-96), the accuracy of Alexa’s link suggestions improves as the num-
ber of participants increases; Alexa claims over 100,000 users since its
September 1997 launch.[13]


[11] The archive doesn't include everything on the Web - there is simply too
much stuff, distributed across too many servers, to locate and store it
all. Some web content changes too frequently, and some sites hide from
navigational services (such as Alexa).

[12] Etoile, which owns the Encyclopaedia Britannica, is also an investor in
Alexa Internet.


There is no technical reason Alexa couldn’t be extended to block sites with 
certain ratings. Alexa (the company) sees its role as providing more infor- 
mation to users, rather than limiting choices, and therefore has no plans to 
incorporate such functionality. On the other hand, Alexa does filter out 
pornographic sites when suggesting links from non-pornographic sites, to 
prevent users from inadvertently stumbling into such material.


An Alexa-like application with built-in filtering would have several advan- 
tages. It could be available for free, as Alexa is today, because revenues 
are derived from context-sensitive advertisements on the toolbar. That 
would help address some of the distribution concerns about existing, stand- 
alone filtering software. Because Alexa can download updates automatically, 
additional labeling bureaus could be used on the fly. 


Finally, labeling could be combined with collaborative filtering. In addi- 
tion to voting on whether they like a site, users could assign content rat- 
ings to sites they visit. Hundreds of thousands of users should be able to 
label more sites than the dozens of college students that filtering software 
companies employ. Ratings would become more reliable over time as the num- 
ber of raters increased. Users could also divide themselves into subgroups 
to label sites based on a tailored rating system. Net Shepherd has done
something similar with a “rating community” of several thousand people (see
Release 1.0, 12-96). Alexa plans to explore other means for users to anno-
tate or rate sites, but only for informational purposes.
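

One simple way to turn many individual votes into a usable label is to wait for a minimum number of raters and then require a clear majority. The sketch below illustrates that idea only. The thresholds and rating names are arbitrary assumptions, and neither Alexa nor Net Shepherd is known to work this way.

```python
from collections import Counter


def consensus(votes, min_votes=3, min_share=0.6):
    """Return the agreed rating, or None if the crowd has not settled.

    votes     -- list of ratings submitted by individual users
    min_votes -- how many raters are needed before any label is issued
    min_share -- fraction of raters who must agree on the top rating
    """
    if len(votes) < min_votes:
        return None
    rating, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_share:
        return rating
    return None
```

With these thresholds, two votes are never enough, and a site whose raters split evenly stays unlabeled until more people weigh in.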


Metadata in Mozilla 


Unlike Microsoft Internet Explorer, Netscape Navigator and
Communicator do not currently support PICS as a built-in feature.
But because Netscape has made the source code for Navigator freely
available as Mozilla (the original, pre-release name for the product;
see Release 1.0, 3-98), programmers outside Netscape can now inte-
grate new features into the browser. One such group, led by
Netscape’s Ramanathan Guha, Robert Churchill and John Giannandrea, is 
implementing XML and RDF in Mozilla. 


RDF in Mozilla will serve several functions. In particular, RDF will
underlie Aurora/NavCenter, a unifying interface for managing book- 
marks, search results, file systems, ftp sites and more. Even if the 
team implementing XML and RDF has no particular interest in enabling 
content filtering, Netscape’s open-source approach will make it simple 
for someone else to build that functionality into Mozilla. 


[13] Alexa is a good citizen when it comes to privacy. The service aggre-
gates user paths but does not store information about individual users.
Alexa also supports TRUSTe (see Release 1.0, 4-98) and makes its privacy
policy readily available. 


FEEDBACK LOOPS


Architecture influences policy, and policy shapes architecture. It’s all
about feedback loops. In this section, we take a few steps back to get some 
perspective on the future of Internet content controls. 


Bounded spaces 


Mutually-reinforcing pressures are leading to more boundaries and more cen- 
tralization on the Internet. Boundaries combat free-rider problems and the 
tragedy of the commons: Open space often becomes over-used or monopolized 
by a small number of individuals. (For an extended discussion of what this 
means for law, see Release 1.0, 2-96.) Bounded online spaces are not limit- 
ed by geography and can therefore overlap. Companies, however, have incen- 
tives to make the boundaries around their communities less permeable. They 
can capture more revenue, directly or indirectly, if they keep users in 
their own space rather than let them jump around. Amazon.com, for example, 
offers one-click ordering to make life simpler for its customers, but also 
to make it relatively easier for an existing customer to order via 
Amazon.com than Barnesandnoble.com.


What does all this have to do with Internet content controls? The answer is 
that content can be controlled much more easily in bounded spaces than in 
unbounded spaces. There is simply too much material to rate everything on 
the open Internet, and new sites are added too quickly for any rating 
process to keep up. Moreover, not everyone will agree on how to rate a site 
or what to do with a given rating. Labeling systems can never be perfect.
They must employ rule-like precision to avoid inconsistency, but flexible 
standards to capture the nuances of individual sites. (See Jonathan 
Weinberg’s 1997 Hastings Communications/Entertainment Law Journal article, 
Rating the Net, for a good analysis of this point.)


The best way to address this conundrum is to change the setting. The limi- 
tations of labeling are diminished in a bounded community, where people vol- 
unteer to join and can keep others out. In effect, communities themselves 
are filters. Users can choose communities that reflect their own values and 
those communities can tailor content to their user base (just as US law 
defines obscenity based on the prevailing standards of the local community).
As Danny Weitzner of the Center for Democracy and Technology puts it: “Your 
view of the Web will be based more on who you trust than what you stumble 
into.” Parents who are worried about pornography may choose a community 
that blocks all unrated material. Others will be comfortable knowing that 
an unrated site might be inappropriate, because they want to allow their 
children to explore a wider variety of sites. 


The distribution challenge 


The problem is getting from here to there. Users may not even realize which
online communities they have joined. Labels and content control tools must
be widely available, but that requires the right incentives. Distribution
is relatively easy in institutional settings, such as schools, libraries and
corporations. In those cases, a local central authority can deploy a fil-
tering system, especially one that is server-based, and content controls will
be available for all users in the organization. Some of these settings
(libraries in particular) raise difficult policy questions that go beyond the
scope of this issue. We are focusing here not on whether communities should
choose to filter, but on how technologies can enable certain choices.


Distribution becomes a bigger barrier for home Internet users. Millions of 
copies of filtering software have been installed, but tens of millions of 
Internet users still do not have such software. The answer comes back to 
boundaries and communities. A dial-up Internet user is not just an individ- 
ual node connected to an amorphous cloud called “the Internet.” Many points 
in the infrastructure can serve as bottlenecks, for better or worse, where 
content control technologies can be inserted. ISPs can play this role, but
so can browser vendors, search engines, online communities, operating sys-
tems, PC vendors and others. Every Internet user should have filtering as
an installed option, whether or not he or she chooses to use it.


Distributing the labels may be the biggest challenge. PICS, XML and RDF 
will address the technical questions, but these standards alone do not cre- 
ate incentives to label. Companies such as N2H2, Secure Computing and 
Content Advisor want to rate as many sites as possible, but none of them 
make their ratings databases publicly available for use in other products. 
RSACi relies on sites to self-rate, and many site creators see no need to 
do so. There is a place for both these models, but others are needed. In 
particular, we would like to see more universal support for PICS among fil- 
tering products, so that users can decide which ratings they prefer. 
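
To illustrate what letting users decide which ratings they prefer could look like, here is a rough sketch in which a filter consults several rating sources in the order the user trusts them. The service names, sites and scores are hypothetical, and the lookup is an assumption about one possible design rather than how PICS or any existing product works.

```python
from typing import Dict, List, Optional

# Hypothetical rating sources, each mapping a site to its own maturity score.
# A real system would fetch labels from each service instead of a local table.
RATING_SERVICES: Dict[str, Dict[str, int]] = {
    "self_rating": {"games.example.com": 1},
    "third_party_a": {"games.example.com": 3, "forum.example.com": 2},
}


def lookup_rating(site: str, preferred: List[str]) -> Optional[int]:
    """Return the score from the first preferred service that rated the site."""
    for service in preferred:
        score = RATING_SERVICES.get(service, {}).get(site)
        if score is not None:
            return score
    # None of the user's chosen services has rated this site.
    return None


# Two users who trust different sources can see different ratings
# for the same site.
print(lookup_rating("games.example.com", ["third_party_a", "self_rating"]))
print(lookup_rating("games.example.com", ["self_rating", "third_party_a"]))
```

A design like this only works if the ratings are available outside the product that produced them, which is the gap the paragraph above points to.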


Ratings will become much more prevalent if broader metadata initiatives suc- 
ceed. There will be many opportunities for companies to add value on top 
of labels. These applications, beginning with more powerful search engines, 
will create more pressure for content creators and third parties to rate 
sites. Such feedback loops are not yet fully developed. Companies including 
Net Shepherd, N2H2 and Content Advisor are moving from filtering into 
searching. Organizations concerned about effective searching, such as the 
Online Computer Library Center, are turning to labeling. We hope that 
before long they meet in a broad middle. 


The devil we know 


Finally, a cautionary note. Lawrence Lessig of Harvard Law School (better 
known as the erstwhile and pending special master on the Microsoft antitrust 
case) has articulated an interesting challenge to PICS. Lessig argues that 
PICS, by standardizing the infrastructure for filtering, facilitates 
censorship. In fact, he claims that PICS is actually more intrusive than the 
CDA.14 Lessig’s basic point is that PICS changes the Web from a 
“non-discriminating” medium (accessible to all) into one capable of 
restricting access. PICS, he points out, enables “zoning” not only to keep 
pornography away from kids, but to achieve any other goal, including 
restricting “subversive” material. Moreover, PICS enables invisible 
filtering anywhere in the network. While legislation such as the CDA may 
restrict the provision of certain types of speech to children, it stops 
there. Lessig's pithy summary is that “PICS is the devil.” 


In essence, Lessig’s argument is about transaction costs. PICS is dangerous 
because it makes it easier for governments to censor. Lessig thinks 
governments are likely to mandate labeling and filtering, because voluntary 
approaches will never be completely effective. Not all sites will be 
labeled, and some people will deliberately mis-label their sites. 
Governments, Lessig believes, will see mandated labeling and filtering, 
combined with prohibitions against false labeling, as the only effective 
solution. 


14 In First Amendment law, the general rule is that government can restrict 
indecent speech so long as there is no less burdensome means to do so 


Lessig’s conclusion is that there must be a more thorough debate about the 
consequences of technologies such as PICS. We agree. Intelligent efforts 
to understand, balance and most importantly disclose the tradeoffs of 
different technologies are essential to effective self-governance. This does 
not mean, however, that we should avoid the whole enterprise. Virtually all 
technologies make repression possible. As Mike Godwin of EFF has noted, 
computer databases make it much easier to track and repress political 
dissidents, but few would argue against their development. We should condemn 
the repressive things that governments do with technology, not the 
technologies themselves. Precluding the development of technologies that 
enable censorship can itself be seen as a form of repression, because it 
means telling software developers what kind of code they can write. 


Furthermore, Lessig’s “discriminating media” are in essence what we have 
called “bounded communities.” Boundaries are by nature exclusionary. As 
discrete communities increasingly define people’s online experiences, the 
Internet will serve as less of a global public park. Something will be 
lost in the process, but other things will be gained. In any event, the 
growth of bounded communities is an inevitable consequence of the scaling of 
the Internet. 


Content controls are here to stay. As a result, the Internet may become 
richer and more complex, but only if companies and users seize the new 
opportunities that these technologies present. 


COMING SOON 


ISP pricing and quality of service. 
Audience behavior measurement. 
Home-based local area networks. 

And much more... (If you know of any 
good examples of the categories listed 
above, please let us know.) 


RESOURCES & PHONE NUMBERS 


Brewster Kahle, Kelly Ransom, Alexa Internet, (415) 561-6793; fax, (415) 
561-6795; brewster@alexa.com, kelly@alexa.com 

Bill Dunn, Alexa Internet, (908) 832-0921; fax, (908) 832-2493; 
bdunnl@aol.com 

Bill Burrington, America Online, (202) 530-7880; fax, (202) 530-7879; 
bill bur@aol.com 

Lorrie Faith Cranor, AT&T, (973) 360-8607; fax, (973) 360-8809; 
lorrie@research.att.com 

Danny Weitzner, Center for Democracy and Technology, (202) 937-9800; fax, 
(202) 637-0968; djw@cdt.org; www.cdt.org 

Steve Shannon, Content Advisor, (617) 628-4900; fax, (617) 628-4333; 
steve@contentadvisor.com 

David Hayden, Critical Path, (415) 543-2800; fax, (415) 543-2800; 
hayden@criticalpath.net; www.criticalpath.net 

Mike Godwin, Electronic Frontier Foundation, (212) 317-6552; 
mnemonic@well.com; www.eff.org 

Barry Steinhardt, Electronic Frontier Foundation, (415) 436-9333; fax, (415) 
436-9993; barrys@eff.org; www.eff.org 

Robin Raskin, FamilyPC, (413) 582-9200; fax, (413) 582-9070; rraskin@zd.com 

Jonathan Weinberg, Federal Communications Commission, (202) 418-2030; fax, 
(202) 418-2807; jweinber@fcc.gov; www.fcc.gov 

Lawrence Lessig, Harvard Law School, (617) 495-8099; fax, (617) 495-4299; 
lessig@pobox.com 

Christine Varney, Hogan and Hartson, (202) 637-6823; fax, (202) 637-5910; 
cvarney@hhlaw.com; www.hhlaw.com 

Peter Nickerson, N2H2, (206) 336-1501; fax, (206) 336-1556; peter@n2h2.com 

Becky Burr, National Telecommunications and Information Administration, 
(202) 482-2581; bburr@ntia.doc.gov; www.ntia.doc.gov 

Jan Gullett, PlanetWeb, (650) 903-7000; jgullett@planetweb.com 

Richard Viets, Secure Computing, (941) 261-5503; fax, (941) 261-4425; 
richard viets@securecomputing. com 

Paul Resnick, University of Michigan, (313) 647-9458; presnick@umich.edu 

Bill Holbert, WatchSoft (Disk Tracy), (281) 282-0188; fax, (281) 282-0926 

Joe Reagle, Ralph Swick, World Wide Web Consortium, (617) 253-2613; fax, 
(617) 258-5999; reagle@w3.org, swick@w3.org; www.w3.org 


Except as noted otherwise, all companies’ Websites are at the likely 
address, http://www.domain_name.com 


For further reading: 


Lorrie Faith Cranor and Paul Resnick, “Technology Inventory,” 
www.research.att.com/~lorrie/pubs/tech4kids/ (describes available 
parental controls) 

Lawrence Lessig, “What Things Regulate Speech,” Jurimetrics (forthcoming 
1998); draft available at cyber.harvard.edu/works/lessig/what_things.pdf. 

Jonathan Weinberg, “Rating the Net,” 19 Hastings Comm/Ent L.J. 453 (1997); 
www.msen.com/~weinberg/rating.htm 
RELEASE 1.0 CALENDAR 


May 26-29 *@Harvard Conference on the Internet and Society - Cambridge, MA. 
A wide-ranging look at the social implications of cyberspace. Call Beverly 
Freeman, (617) 432-1638; cybercon@sph.harvard.edu 

June 3-6 International Design Conference in Aspen (the 48th annual) - Aspen, 
CO. Organized by IDCA. The annual design conference, cutting across all 
design disciplines. Call (970) 925-2257; fax, (970) 925-8495; idca@csn.net; 
www.idca.org 

June 8 EPIC Cryptography and Privacy Conference - Washington, DC. Sponsored 
by the Electronic Privacy Information Center, the Harvard Information 
Infrastructure Project and the London School of Economics. Conference agenda 
and registration information are available at www.epic.org/events/crypto98. 

June 20-25 Ed-Media/Ed-Telecom '98 - Freiburg, Germany. Organized by 
Association for the Advancement of Computing in Education. Focus on 
educational multimedia, hypermedia and telecom. Call (804) 973-3987; fax, 
(804) 978-7449; aace@virginia.edu. 

June 23-24 Digital Kids '98 - San Francisco, CA. Sponsored by Jupiter 
Communications. Explores the online migration of child- and family-oriented 
software, games, and educational content. Call (800) 488-4345; fax, (212) 
780-6075; Hema@jup.com 

June 24-26 *WITI: Technology Summit '98 - Santa Clara, CA. Organized by 
Women in Technology International. With Esther Dyson. Call (800) 334-9484 
or (818) 990-6705; fax, (818) 906-3299; conference-info@witi.org. 

July 19-21 @Spotlight '98 - Laguna Niguel, CA. Sponsored by IDG. Michael 
Schrage hosts discussions on new business models for media and technology. 
With Barry Diller, Joe Nacchio and David Dorman. Call (800) 633-4312; fax, 
(650) 286-2750; conference registrar@idg.com 

July 19-22 ISA Summit '98 - Los Angeles. Sponsored by Interactive Services 
Association. Call (301) 495-4955; fax, (301) 495-4959; isa@isa.net 

July 20-24 *INET '98 - Geneva. Sponsored by the Internet Society. Over 3000 
expected. Call Mark Measday, 41 (22) 344-64-64; fax, 41 (22) 345-92-58; 
measday@osmarian.ch. 

July 26-30 Fifteenth National Conference on Artificial Intelligence - 
Madison, WI. Sponsored by AAAI. Also the tenth annual Innovative 
Applications in AI program. Write Charles Rich, rich@merl.com 

Oct 11-13 **EDventure’s High-Tech Forum - Copenhagen, Denmark. Sponsored by 
EDventure Holdings. Call Daphne Kis, (212) 924-8800; fax, (212) 924-0240; 
daphne@edventure.com; www.edventure.com 

November 4-6 Human Resource Technology Conference and Exposition - 
Philadelphia, PA. Latest on knowledge management and corporate intranets. 
Sponsored by L.R.P. Call (800) 727-1227; fax, (703) 739-0489; 
lrpconf@lrp.com. More information at 
www.lrp.com/Conferences/conferences.htm 

November 16-20 COMDEX/Fall '98 - Las Vegas, NV. Over 2,000 exhibitors, 
over 10,000 new products, and 200,000 attendees from over 
100 countries. Registration information available at 
www.comdex.com 

November 19-21 Annual Conference on Technology & Society: Washington, 
D.C. vs. Silicon Valley - San Jose, CA. Cosponsored by 
the CATO Institute and Forbes ASAP. Featuring Eric 
Schmidt, Scott Cook, and others. Contact Bethany Blue, 
(202) 789-5203; fax, (202) 371-0841; bblue@cato.org 


* Events Esther plans to attend. 
@ Events Jerry plans to attend. 


Lack of a symbol is no indication of lack of merit. 


The full, current calendar is available on our Website (www.edventure.com). 
Please let us know about other events we should include. — Mari Katsunuma 